Abstract: As the Internet AS-level topology grows over time,
some of its structural properties remain unchanged. Such
time-invariant properties are generally interesting, because they
tend to reflect some fundamental processes or constraints behind
Internet growth. As has been shown before, the time-invariant
structural properties of the Internet include some of the most
basic ones, such as the degree distribution or clustering. Here we
add to this time-invariant list a non-trivial property: the k-dense
decomposition. This property is derived from a recursive form
of edge multiplicity, defined as the number of triangles that share
a given edge. We show that after proper normalization, the
k-dense decomposition of the Internet has remained stable over
the last decade, even though the Internet size has approximately
doubled, and so has the k-density of its k-densest core. This
core consists mostly of content providers peering at Internet
eXchange Points, and it only loosely overlaps with the
high-degree or high-rank AS core, consisting mostly of tier-1
transit providers. We thus show that high degrees and high
k-densities reflect two different Internet-specific properties of
ASes (transit versus content providers), thus explaining strong
fluctuations between degrees and k-densities, and the related
observation that random graphs with the same degree distribution
or even degree correlations as in the Internet do not reproduce its
k-dense decomposition. Therefore an interesting open question is
what Internet topology models or generators can fully explain or
at least reproduce the k-dense properties of the Internet.

Index Terms: Internet topology, network evolution, k-dense,
digraphs.



I. Introduction 

The discovery of power laws in the Internet in 1999 [1]
came as a big surprise to many, and as a source of major
disbelief to some. Even more surprising is that over the
last decade, an increasing number of increasingly refined
and complete macroscopic Internet topology measurements
and data sources (with one exception, WHOIS) show that
the power-law distribution of AS degrees in the Internet has
remained exceptionally stable, i.e. time-invariant [2], [3], [4],
[5], [6], [7]. One always has to keep in mind that these
macroscopic measurements may miss a significant percentage
of links. Indeed, there have been several studies supporting
this expectation: huge percentages of links are reported as
missing in [8], [9], [10], for example. These studies rely on
proprietary data collected from only a few ASes. Therefore
questions about the statistical significance of the reported
results, which are difficult or impossible to reproduce due to the
proprietary nature of the data, are well grounded. However,
these questions may not be so important given that
all these studies report that a majority of missing links are
peering, while customer-provider links appear to be covered
well in macroscopic topology measurements. These results are
expected, given the ease with which peering links are "set up" at
Internet eXchange Points (IXPs), for example: as soon as an
AS connects to an IXP and declares an open peering policy, it
can exchange traffic with any other openly peering AS at the
same IXP. It is instructive to compare the setup of such "links"
with the process behind setting up customer-provider links,
often involving complicated business decisions, negotiations,
and payment agreements [11], [12].

The time-invariant nature of the power-law distribution of
(customer-provider) degrees, as well as of strong clustering,
has recently found an interesting explanation in certain
trade-off optimization drivers behind Internet evolution [13].
These drivers are truly fundamental and apply not only to
the Internet, but also to social and biological networks, and
even to spacetime in our accelerating universe [13]. More
generally, any time-invariant property of an evolving network
structure is a candidate to reflect some fundamental forces or
constraints behind network evolution, unless this property is a
simple statistical consequence of some other property, the degree
distribution, for example. These forces and constraints can be
quite general, applicable to many different networks, as is the
case with degree distribution and clustering, or they can be
specific to a given network, in which case they likely reflect
some specific functions that this network performs.

Here we show that the Internet k-dense decomposition is
time-invariant, and explain this invariance via Internet-specific
AS data reflecting different functions and business roles of
different ASes. The k-dense decomposition [15] of a graph
G is a hierarchy of nested subgraphs $H_k$,
$H_k \subset H_{k-1} \subset G$, $k = 2, 3, \ldots, k_{MAX}$,
induced by edges belonging to $k - 2$ or more triangles within
$H_k$. If a link belongs to $H_k$ but not to $H_{k+1}$, it is
said to have the k-dense-index equal to k. The k-dense-index of a
node is the maximum k-dense-index of its incident edges. All
other definitions are in Section III. The k-dense-index of an edge
is thus a recursive variant of edge multiplicity, defined as the
number of triangles in G that an edge is a member of. Edge
multiplicity was introduced and studied in [16], [17], where it
was also shown that the edge multiplicity distribution in many
real networks is either power-law or fat-tailed, and that many
existing network models fail to reproduce this property. The
k-dense decompositions of one-time Internet snapshots have been
analyzed in [18]. Here we collect historical Internet topology
data from May 2004 to May 2012 (Section II) to study Internet
evolution from the k-dense decomposition perspective. Our main
results include:

1) The maximum k-dense-index $k_{MAX}$ exhibits a clear
growing trend, increasing from 29 in 2004 to 48 in 2012
(Section IV-A), while the number of ASes in the densest
core $H_{k_{MAX}}$ is small and fluctuating: it is 59 in 2004
and 60 in 2012. The $k_{MAX}$ growth can only partially be
attributed to the growing average degree. Other factors
are likely related to increasing open peering at IXPs: all
the 60 ASes in the 2012 $H_{k_{MAX}}$ are present at one or
more IXPs (Section V-A).

2) After proper normalization (Section IV-B), the Internet
k-dense decomposition is shown to be time-invariant in
Section IV-C. The Internet-specific interpretation of this
result appears in Sections V-B and V-C.

3) A significant part of this interpretation is centered
around the observation that the degree of an AS and
its k-density reflect two different Internet-specific
properties of the AS, two different AS business roles. Related
results include:

a) While some correlations between AS degree and
k-density are present as expected, the fluctuations
between the two quantities are very strong
(Section IV-D).

b) Random graphs having the same degree distribution
or even degree correlations as the Internet
do not have the same k-dense decomposition
(Section IV-E).

c) The analysis of PeeringDB data reveals that while
high-degree or high-rank ASes tend to be tier-1
transit providers, high-k-density ASes tend to
be tier-2 and content providers peering at IXPs
(Section V).

4) The structure of the densest core $H_{k_{MAX}}$ is statistically
determined by its degree distribution alone (Section IV-F).
We interpret this result via open peering policies of
participating ASes, which do not choose their peers
based on their degrees (Section V-D). In contrast,
selective peering policies, present elsewhere in the
Internet, likely introduce degree correlations, providing
a new interpretation of the main result in [19], where
the AS-level topology of the Internet was found to be
statistically determined by its degree correlations.

Some other related work dealing with Internet evolution
includes [5], which evaluates how different structural
properties, related to the distributions of node degrees, centralities,
path lengths, community structure, etc., change over time,
from January 2002 to January 2010. The study relies on
the Cramer-von Mises criterion to identify changes between
the distributions, and finds that most distributions remain
unchanged, except for the average path length and clustering
coefficient. These changes are interpreted as a consequence of
peering policy changes. The different growth dynamics of the
IPv4 and IPv6 topologies from 1997 to 2009 are juxtaposed in
[20]. The main result is that IPv4 topology growth had a phase
transition in 2001, while IPv6 had a different phase transition
in 2006. The authors of [21] focus on topology liveness and
completeness problems, comparing different Internet topology
measurement data sources for the period from January 2004 to
December 2006. Two evolution trends are highlighted in this
work: a) customer networks are the major cause of the overall
topology growth, b) transit providers tend to form increasingly
denser structures. The monumental study [3] analyzes the
evolution of customer-provider connections in the Internet from
January 1998 to January 2010. AS links and nodes are labeled
by business relationships and roles, and studied separately.
The authors find that enterprise networks and content/access
providers at the periphery are the main contributors to the
overall Internet growth. They also study rewiring activity, and
find that content/access providers appear as the most active in
that regard. The evolution of the k-core decomposition of the
Internet from December 2001 to December 2006 is studied
in [22]. Similar to the k-dense decomposition that we analyze
here, the k-core decomposition also appears time-invariant
according to [22]. However, contrary to the maximum
k-density, the maximum k-coreness does not exhibit any growing
trend. To the best of our knowledge, it remains unclear if these
results in general, and AS coreness in particular, can find any
Internet-specific interpretations.

II. Data 

We use the AS-level topology data from the UCLA Computer
Science Department's Internet Research Lab (IRL) [23],
[24]. We collect yearly Internet snapshots dated from May
2004 to May 2012. Specifically, for each year, we download
the data corresponding to May 31st, and then discard all the
links whose last-seen attribute is older than May 1st. Table I
reports the numbers of AS nodes and links in each snapshot.

We emphasize that BGP and traceroute-based data provide
incomplete and biased views of the real Internet AS-level
topology. For example, many connections between leaf
(low-degree) ASes are hidden from monitors located in hub
(high-degree) ASes [25]. Changing numbers of monitors introduce
various artifacts and aberrations in the observed topologies.
Since only the best paths are typically announced and used
for routing, the observed views are highly incomplete. Specific
to traceroute measurements, there are many non-responding
ASes, especially leaf ASes comprising a majority of Internet
ASes. There are also many issues with mapping IP addresses
to AS numbers, including IP addresses for which there are
multiple or no mapping ASes, and many other vagaries [26].

Our decision to use the IRL BGP data was motivated
primarily by the observation that the growth of the number of
ASes in this dataset is relatively consistent with the growth of
the Advertised AS Count in the CIDR report [27].

III. k-Dense Definitions

The k-dense decomposition is a recursive graph decomposition
[15], based on edge multiplicity [16], [17]. The
multiplicity $m_G(i,j)$ of edge $(i,j)$ in graph G is the number of
triangles in G containing the edge, or equivalently, the number
of common neighbors of connected nodes i and j, see Figure 1.
By definition, the k-dense subgraph $H_k$ of graph G is the
subgraph induced by all the links with multiplicity larger than or
equal to $k - 2$ in the subgraph:

$$m_{H_k}(i,j) \geq k - 2. \qquad (1)$$
TABLE I

Number of AS nodes and links.

        2004    2005    2006    2007    2008    2009     2010     2011     2012
Nodes   17,858  20,486  23,044  26TT01  29,042  32,379   35,383   38,888   42,419
Links   50,326  59,382  66,781  79,931  90,588  102,362  113,846  127,558  146,271



This subgraph can be obtained from G by iteratively pruning all
the links with multiplicity smaller than $k - 2$. All the links in
$H_k$ have multiplicity equal to or greater than $k - 2$ in G as well,
$m_{H_k}(i,j) \geq k - 2 \Rightarrow m_G(i,j) \geq k - 2$, but the
converse is not generally true. That is, links with
$m_G(i,j) \geq k - 2$ are only candidate $H_k$ links.

Fig. 1. Link (i,j) of multiplicity $m(i,j) = k - 2$: nodes i and j have
$k - 2$ common neighbors, labeled NEIGHBOR-(1) through NEIGHBOR-(k-2).

The k-dense decomposition of G is a set of nested subgraphs
$H_{k_{MAX}} \subset \ldots \subset H_{k+1} \subset H_k \subset \ldots \subset G$,
where $H_{k_{MAX}}$ is the smallest and densest non-empty subgraph.

A link is said to have the k-dense-index equal to $k^*$ if it
belongs to $H_{k^*}$ but not to $H_{k^*+1}$. The set of all the links
with the k-dense-index equal to $k^*$ is called the $k^*$-dense-shell.

A node is said to have the k-dense-index equal to $k^*$ if $k^*$
is the maximum k-dense-index of its incident links. The set
of all the nodes with k-dense-index equal to $k^*$ is called the
$k^*$-dense-set.

Simply put, the k-dense decomposition shows how densely
connected the nodes are, and how nodes belonging to different
k-dense sets interconnect. Compared to the similar k-core [28] or
k-clique [29] decompositions: a) the k-dense decomposition
suggests stronger relationships among nodes belonging to
the same k-set; b) the k-dense decomposition is less sensitive
to noise than the k-clique decomposition; c) from a computational
standpoint, the k-dense decomposition is slightly more
complex than k-core but much less complex than k-clique.
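
The iterative pruning described in this section can be sketched in a few lines of Python. This is a naive illustration under stated assumptions, not the implementation used for the results reported here: the function names `k_dense_indices` and `node_indices` are hypothetical, the graph is assumed to be undirected and simple, and no attempt is made at efficiency.

```python
from collections import defaultdict


def k_dense_indices(edges):
    """Return {frozenset({i, j}): k-dense-index} for an undirected simple graph.

    Naive peeling: H_{k+1} is obtained from H_k by repeatedly removing links
    whose multiplicity (number of common neighbors of the two end nodes) in
    the current subgraph is below (k + 1) - 2 = k - 1. Links removed in that
    step belong to H_k but not to H_{k+1}, so their k-dense-index is k.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored (assumption)
            adj[u].add(v)
            adj[v].add(u)
    remaining = {frozenset((u, v)) for u in adj for v in adj[u]}
    index = {}
    k = 2  # H_2 contains every link, since multiplicity >= 0 always holds
    while remaining:
        pruned = True
        while pruned:  # iterate to a fixed point for this value of k
            pruned = False
            for edge in list(remaining):
                u, v = tuple(edge)
                if len(adj[u] & adj[v]) < k - 1:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    remaining.discard(edge)
                    index[edge] = k
                    pruned = True
        k += 1
    return index


def node_indices(edge_index):
    """k-dense-index of a node: maximum k-dense-index of its incident links."""
    result = {}
    for edge, k in edge_index.items():
        for node in edge:
            result[node] = max(result.get(node, 2), k)
    return result


# Toy graph: a 4-clique on nodes 0..3 plus a pendant link (3, 4).
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
toy_index = k_dense_indices(toy_edges)
toy_nodes = node_indices(toy_index)
```

In the toy graph every clique link lies in two triangles and survives until $H_4$, while the pendant link lies in no triangle and is removed when $H_3$ is formed.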

IV. Internet's k-Dense Decomposition

Treating the data described in Section II as a sorted set of
AS-level graphs ordered by year, we report in this section
the results of the statistical analysis of the Internet's k-dense
decomposition and its evolution over time. The interpretation
of these results is deferred to Section V.

A. Basic trends 

Similar to other studies, e.g. ED . [01 , we have not observed
and do not report any significant changes in the basic graph
properties (degree distribution, degree correlations, clustering,
betweenness, and shortest path distributions), even though
the graph has grown significantly, Table I and Figure 2. We
see that the number of links M has been growing faster than
the number of nodes N, meaning that the average degree
$\bar{k} = 2M/N$ has been increasing. This increase appears to be
a logarithmic function of N, $\bar{k} \approx a \ln N - b$, with a = 1.3 and
b = 7.5, in agreement with fOl . iTPfl . The increasing trend
of $k_{MAX}$, growing from 29 in 2004 to 48 in 2012, indicates
that more densely connected parts in the Internet have been
forming in the course of its growth.

Fig. 2. Internet growth in terms of numbers of nodes, links, $k_{MAX}$-dense-index
and average degree. The black triangles show $N(t)/N(t_0)$, the number
of nodes in the graph at time t divided by the number of nodes at time
$t_0 = 2004$, $N(t_0) = 17,858$. The black rhombuses show the same ratio
for the number of links, $M(t)/M(t_0)$, $M(t_0) = 50,326$. The black squares
are $k_{MAX}(t)/k_{MAX}(t_0)$, $k_{MAX}(t_0) = 29$. The black circles show the
average degree of the graph at time t divided by the average degree at time
$t_0 = 2004$, $\bar{k}(t_0) = 5.64$. The average degree grows logarithmically with
the Internet size, $\bar{k} \approx a \ln N - b$, with a = 1.3 and b = 7.5. The empty
circles are $(a \ln N(t) - b)/(a \ln N(t_0) - b)$, showing the quality of this
approximation.
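
As a quick arithmetic check of the average degree $\bar{k} = 2M/N$ and of the normalized logarithmic approximation plotted in Figure 2, the following sketch uses only the 2004 and 2012 values from Table I and the quoted constants a = 1.3 and b = 7.5. The function and variable names are illustrative assumptions.

```python
import math

# Node and link counts of the first and last snapshots, taken from Table I.
NODES = {2004: 17858, 2012: 42419}
LINKS = {2004: 50326, 2012: 146271}
A, B = 1.3, 7.5  # constants of the logarithmic approximation quoted in the text


def average_degree(n, m):
    """Average degree of a graph with n nodes and m links: 2M / N."""
    return 2.0 * m / n


def log_approximation(n, a=A, b=B):
    """Approximate average degree a * ln(N) - b."""
    return a * math.log(n) - b


# Ratios relative to t0 = 2004, the quantities compared in Figure 2.
measured_ratio = (average_degree(NODES[2012], LINKS[2012])
                  / average_degree(NODES[2004], LINKS[2004]))
approximated_ratio = log_approximation(NODES[2012]) / log_approximation(NODES[2004])

print("measured average-degree ratio 2012/2004:", round(measured_ratio, 3))
print("approximated ratio 2012/2004:", round(approximated_ratio, 3))
```

Comparing ratios rather than absolute values mirrors the normalization by the 2004 values used in Figure 2.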



B. k-dense normalization 

For each snapshot we first compute the number of nodes and
links with a given k-dense-index, i.e. the number of nodes in
each k-dense-set and the number of links in each k-dense-shell.
Since the graph in the snapshots and its $k_{MAX}$ are growing,
to be able to properly compare the nine snapshots we next
perform the following normalization, mapping k-dense-indices
and corresponding numbers of nodes and links to fractions
with values between 0 and 1:

• x-axis normalization: map each index k in each snapshot
to what we call the k-dense-index fraction:

$$x = \frac{k - k_{MIN}}{k_{MAX} - k_{MIN}},$$

where $k_{MIN}$ and $k_{MAX}$ are the minimum and maximum
values of the k-dense-index in the snapshot;
• y-axis normalization: divide the corresponding number of
nodes or links by the total number of nodes or links in
the snapshot.

In all the considered snapshots there are no nodes of degree
0, so that $k_{MIN} = 2$ for links and nodes in all the snapshots.
This means that x = 0 corresponds to the k-dense-index equal
to 2, while x = 1 corresponds to the k-dense-index equal
to $k_{MAX}$, which grows with time. Upon such normalization
it becomes difficult to read off the absolute values of
k-dense-indices, nodes, or links from the resulting plots, but this
normalization is helpful to see if there are any size- and
time-invariant statistical trends.
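
The two normalization steps can be illustrated with a short sketch. The function name `normalize` and the toy counts are assumptions made for illustration; the input is a mapping from k-dense-index to the number of nodes (or links) with that index in one snapshot.

```python
def normalize(counts):
    """Map {k-dense-index: count} to a sorted list of (x, y) pairs.

    x = (k - k_MIN) / (k_MAX - k_MIN) is the k-dense-index fraction and
    y = count / total is the fraction of nodes (or links) with that index.
    """
    k_min, k_max = min(counts), max(counts)
    total = sum(counts.values())
    span = k_max - k_min
    points = []
    for k, count in sorted(counts.items()):
        # If all indices coincide, the fraction is set to 0 by convention here.
        x = (k - k_min) / span if span else 0.0
        points.append((x, count / total))
    return points


# Hypothetical snapshot: 6 nodes with index 2, 3 with index 3, 1 with index 5.
toy_points = normalize({2: 6, 3: 3, 5: 1})
```

After this mapping, snapshots with different sizes and different $k_{MAX}$ values share the same [0, 1] ranges on both axes and can be overlaid or averaged.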

C. k-dense decomposition 

In Figures 3(a) and 3(b) we report the normalized fractions
of nodes and links for each k-dense-index, averaged over the
nine historical snapshots. These average values are highly
representative for each snapshot from 2004 to 2012, in the
sense that the distributions of the actual values across the
snapshots are quite narrow, as indicated by the 80% percentile
and min/max deviations from the averages in the figures. In
other words, the shown normalized statistics are fairly stable
over time, i.e. time- and size-invariant. In particular, a vast
majority of nodes and links in all the snapshots have low
k-dense-indices, not belonging to any dense communities.

Similar to the joint degree distribution, showing how nodes
of different degrees interconnect [19], we next focus on how
different k-dense-sets interconnect. In Figure 3(c) we report the
normalized fraction of links attached to nodes in a given
k-dense-set. Similarly to Figures 3(a) and 3(b), we observe that this
statistic is also stable over time. We also observe that there
are two classes of k-dense-sets to which a vast majority of
all links are attached: those with the smallest and largest
k-dense-index fractions, close to 0 and 1. Upon examination of
the original data we find that the left peak in Figure 3(c) is
formed by the links attached to the 2- and 3-dense sets, while
the right peak is due to the links attached to the $k_{MAX}$-dense
set. This observation motivates us to further focus on the links
attached to these three sets, and show where their other ends
go in Figures 3(d)-3(f).

We see that 2- and 3-dense-attached links go mostly to
other low-dense-index nodes and to the nodes in the
$k_{MAX}$-dense-set, and that $k_{MAX}$-dense nodes direct a considerable
percentage of their connections not only to nodes in
low-dense-sets, but also to nodes in densely connected parts of the
network. We also notice that all the considered normalized
statistics appear to be quite stable over time, as indicated by
the narrow 80% percentiles.

TABLE II

Average properties of nodes in the 2-, 3-, and $k_{MAX}$-dense-sets:
average degree $\bar{k}$, average clustering coefficient $\bar{c}$, and
average betweenness centrality $\bar{b}$.

                     k        c        b
2-dense-set          1.643    0        2.1 · 10^
3-dense-set          3.161    0.746    9.3 · 10^3
k_MAX-dense-set      403.8    0.194    3.0 · 10^6

Given that a vast majority of links are attached to nodes
in 2-, 3-, or $k_{MAX}$-dense sets, we further characterize these
sets in Table II. The significant percentages of links attached
to 2- and 3-dense sets are not surprising given that a
majority of nodes belong to these sets, Figure 3(a). We see in
Table II that these peripheral sets are populated with nodes of
low average degree and betweenness centrality of the order of
the number of nodes N in the graph. The $k_{MAX}$-dense-set, on
the other hand, consists of a small number of nodes with high
average degree and betweenness of the order of $N^2$. That is,
these nodes have a central position in the network: they are a
key element shaping the overall connectivity, with many links
attached to them, Figure 3(c), which motivates us to further
analyze the structure of this $k_{MAX}$-dense subgraph. However,
before doing so in Section IV-F, we first answer the following
two natural questions:

1) Does the degree of a node fully define its k-dense-index,
making this index a redundant statistic providing no new
information compared to node degree?

2) Do random graphs having the same degree distribution
or degree correlations as the Internet fully reproduce all
its k-dense properties, making them statistically
insignificant, that is, casting them as statistical consequences of
the fact that this graph has the observed (joint) degree
distribution?

D. k-dense-index versus node degree 

If all nodes of the same degree have the same k-dense-index,
then one determines the other, so that the second statistic has
no independent value. In Figure 4 we show the relationship
between k-dense-indices of nodes and their degrees in the
2012 snapshot. As expected, the average degree of nodes in a
given k-dense-set as a function of k exhibits a growing trend;
however, the fluctuations are high. In particular, the degrees
of nodes in the 2-dense-set vary from 2 to 45, while the
degrees of nodes in the $k_{MAX}$-dense-set ($k_{MAX} = 48$) vary
from 103 to 2964. The other way around, nodes of degree
100, for example, have k-dense-indices ranging from 12 to 45, and
the node with the maximum degree, AS 174 (Cogent Communications
Inc., degree k = 3655), has the k-dense-index equal to $32 < k_{MAX}$.
We conclude that the k-dense-index is not fully determined by the
node degree, an observation that gains further support and an
Internet-specific explanation in Section V.
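
The comparison in this subsection amounts to grouping nodes by k-dense-index and inspecting the spread of their degrees, and vice versa. A minimal sketch follows, assuming two hypothetical dictionaries `degree` and `k_index` keyed by node; the toy values are invented for illustration and are not taken from the Internet data.

```python
from collections import defaultdict


def degree_spread(degree, k_index):
    """Return {k: (min degree, mean degree, max degree)} over each k-dense-set."""
    groups = defaultdict(list)
    for node, k in k_index.items():
        groups[k].append(degree[node])
    return {k: (min(d), sum(d) / len(d), max(d)) for k, d in groups.items()}


def degree_determines_index(degree, k_index):
    """True if all nodes of the same degree share the same k-dense-index."""
    seen = {}
    for node, k in k_index.items():
        if seen.setdefault(degree[node], k) != k:
            return False
    return True


# Invented toy data: nodes "a" and "c" have equal degree but different indices.
toy_degree = {"a": 3, "b": 1, "c": 3, "d": 7}
toy_k_index = {"a": 2, "b": 2, "c": 4, "d": 4}
toy_spread = degree_spread(toy_degree, toy_k_index)
toy_redundant = degree_determines_index(toy_degree, toy_k_index)
```

If `degree_determines_index` returned True for a real snapshot, the k-dense-index would carry no information beyond the degree; the wide per-set degree ranges reported above show that this is not the case for the 2012 Internet.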

E. Statistical significance of the k-dense properties 

Even though the node degree does not fully determine the
node k-dense-index, it may still be the case that random graphs
having the same degree distribution as the Internet also
have exactly the same k-dense properties, meaning that these
properties are nothing but statistical artifacts, as turned out to
be the case with the rich club connectivity in the Internet [30].
To check if a similar story applies to the k-dense properties, we
construct dK-random graphs [19] for d = 0, 1, 2 as described
in Section IV-F. We generate 10 realizations for each d. These
graphs are random graphs with the same average degree,
degree distribution, or joint degree distribution as in the
Internet 2012 snapshot. In Table III we report the distributions
of the $k_{MAX}$ values in these random graphs for each d, while
in Figure 5 we show the k-dense link decompositions of these
graphs, and juxtapose them against the Internet's. We observe
that neither the degree distribution nor the joint degree distribution
fully reproduces the k-dense properties of the Internet, meaning
that these properties have their own statistical significance,
which is another observation that finds a natural
Internet-specific explanation in Section V. Yet another property that
finds an explanation rooted in Internet specifics in that section
is the unexpectedly simple structure of the $k_{MAX}$-dense core
that we analyze next.

Fig. 3. Normalized k-dense decomposition of the Internet. The x-axes are linearly binned with the bin size equal to 0.05, and each plot shows the average
(black dots and solid lines), 80% percentile (vertical bars), and minima and maxima (dashed lines) of the corresponding values computed across the nine
historical snapshots of the Internet. (a) is the fraction of nodes with a given k-dense-index, i.e. the number of nodes in the k-dense-set divided by the total
number of nodes in the corresponding snapshot. (b) is the fraction of links with a given k-dense-index, i.e. the number of links in the k-dense-shell divided
by the total number of links in the corresponding snapshot. (c) is the fraction of links attached to a given k-dense-set, i.e. the number of links whose one or
two ends are attached to nodes in a given k-dense-set, divided by the total number of links in the corresponding snapshot. (d) is the number of links with
one end attached to a 2-dense-set node and with the other end attached to a k-dense-set node, divided by the number of links attached to the 2-dense-set. (e)
and (f) show the corresponding fractions for links attached to 3- and $k_{MAX}$-dense-sets.

Fig. 4. Distributions of node degrees in k-dense-sets in the 2012 Internet
snapshot. The figure shows the averages, 80% percentiles, and min/max values
for each of the 48 distributions.

TABLE III

$k_{MAX}$-dense-index in Internet 2012 and its dK-randomizations
for d = 0, 1, 2. The table shows the averages $\bar{k}_{MAX}$ and
standard deviations $\sigma$ of the distributions of the $k_{MAX}$-index
values in dK-random graph instances for d = 0, 1, 2.

              k_MAX    sigma
Internet      48       -
0K-random     3        0
1K-random     68       4.05
2K-random     44.3     0.46

F. Structure of the $k_{MAX}$-dense core

Given the importance of $H_{k_{MAX}}$, the densest and innermost
subgraph core (Section IV-C), we next characterize its structure
in more detail. In Table IV we show the basic properties of
the $H_{k_{MAX}}$ subgraphs in all the nine snapshots. We see that these
subgraphs are quite small and dense (link density D = 1 in
complete graphs).
Fig. 5. Normalized k-dense decomposition of links in Internet 2012 and
its dK-randomizations for d = 0, 1, 2. The figure is similar to Figure 3(b)
and shows the averages (data points and solid lines) and 80% percentiles
(vertical bars) of the distributions of k-dense link fractions in dK-random
graph instances for d = 0, 1, 2.



TABLE IV

Basic properties of $k_{MAX}$-dense cores in the nine historical
snapshots: N the number of nodes; M the number of links;
$D = 2M/N/(N - 1) \approx \bar{k}/N$ the link density.

Year    k_MAX    N     M        D
2004    29       59    1,381    0.807
2005    33       55    1,318    0.888
2006    32       44    872      0.922
2007    34       97    3,159    0.678
2008    40       63    1,763    0.903
2009    37       55    1,353    0.911
2010    39       60    1,595    0.901
2011    42       81    2,745    0.847
2012    48       60    1,703    0.962
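
The link density column of Table IV follows directly from the N and M columns via $D = 2M/(N(N - 1))$. The short check below recomputes it; the dictionary name `CORES` is an illustrative assumption and the values are copied from the table.

```python
# (N, M, reported D) for the k_MAX-dense core of each snapshot, from Table IV.
CORES = {
    2004: (59, 1381, 0.807),
    2005: (55, 1318, 0.888),
    2006: (44, 872, 0.922),
    2007: (97, 3159, 0.678),
    2008: (63, 1763, 0.903),
    2009: (55, 1353, 0.911),
    2010: (60, 1595, 0.901),
    2011: (81, 2745, 0.847),
    2012: (60, 1703, 0.962),
}


def link_density(n, m):
    """Fraction of the N(N-1)/2 possible links that are present."""
    return 2.0 * m / (n * (n - 1))


# Largest absolute difference between recomputed and reported densities.
max_deviation = max(abs(link_density(n, m) - d) for n, m, d in CORES.values())
print("largest deviation from the reported D:", round(max_deviation, 4))
```

A complete graph gives D = 1, so values above 0.9 in most years indicate that the core is close to a clique.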



Similar to the previous section, we next perform the
standard dK-statistical analysis [19] of these subgraphs. The
procedure is as follows. First we focus on the latest 2012 snapshot,
extract the $H_{k_{MAX}}$ subgraph from it, and treat this subgraph
as a separate graph. In particular, node degrees are computed
within $H_{k_{MAX}}$. Then we construct 20 random graphs having
exactly the same average degree as this graph, using the
standard $G_{N,M}$ (Erdős-Rényi [31]) construction procedure of
throwing M edges onto $N(N - 1)/2$ node pairs uniformly at
random. These graphs are called 0K-random graphs. Then we
construct 20 random graphs having exactly the same degree
sequence as $H_{k_{MAX}}$ using the fast generalized Havel-Hakimi
algorithm from [32]. This algorithm is guaranteed to always
succeed quickly as long as the degree sequence is graphical,
i.e. realizable. The resulting graphs are called 1K-random
graphs. Then we compute the basic structural graph properties
that are not guaranteed to be the same in either 0K- or
1K-random graphs as in the original $H_{k_{MAX}}$ graph. The results are
in Figure 6.
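
The two constructions can be sketched as follows. This is an illustration under explicit assumptions, not the procedure from [32]: the $G_{N,M}$ part samples M distinct node pairs uniformly, while the 1K part uses the plain textbook Havel-Hakimi construction (which yields one deterministic realization) followed by degree-preserving double-edge swaps to randomize it. All function names are hypothetical.

```python
import itertools
import random


def gnm_random_graph(n, m, seed=None):
    """0K-style graph: m distinct links chosen uniformly among n(n-1)/2 pairs."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n), 2))
    return set(rng.sample(pairs, m))


def havel_hakimi(degrees):
    """Return a link set realizing the degree sequence, or None if not graphical."""
    residual = dict(enumerate(degrees))
    edges = set()
    while True:
        order = sorted(residual, key=lambda v: residual[v], reverse=True)
        u, d = order[0], residual[order[0]]
        if d == 0:
            return edges  # every node has reached its target degree
        targets = order[1:d + 1]
        if len(targets) < d or any(residual[v] == 0 for v in targets):
            return None  # the sequence cannot be realized by a simple graph
        for v in targets:
            edges.add((min(u, v), max(u, v)))
            residual[v] -= 1
        residual[u] = 0


def degree_preserving_swaps(edges, swaps, seed=None):
    """Randomize a graph by swaps (a,b),(c,d) -> (a,d),(c,b), keeping degrees."""
    rng = random.Random(seed)
    edge_list = sorted(edges)
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_set
    for _ in range(swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second link
        if len({a, b, c, d}) < 4:
            continue  # the swap would create a self-loop
        new1, new2 = (min(a, d), max(a, d)), (min(c, b), max(c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate link
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_set


def degree_sequence(edges, n):
    """Degrees of nodes 0..n-1 in the given link set."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


target = [3, 3, 2, 2, 2]
base = havel_hakimi(target)
shuffled = degree_preserving_swaps(base, swaps=100, seed=1)
```

The number of swaps needed for adequate mixing depends on the graph and is left as a free parameter in this sketch.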

The average neighbor degree is not guaranteed to be captured
by 1K-random graphs. Only 2K-random graphs, having
exactly the same joint degree distribution as the original graph,
are guaranteed to reproduce this degree correlation statistic. Yet we
find that 1K-random graphs have the same average neighbor
degree as $H_{k_{MAX}}$. Clustering is guaranteed to be fully captured
only by 3K-random graphs, reproducing the frequencies
of triangular subgraphs. Yet we find that 1K-random graphs
have the same clustering as $H_{k_{MAX}}$. The frequencies of motifs
of size 3 or 4 are guaranteed to be fully captured only by 3K-
or 4K-random graphs, the latter reproducing the frequencies
of subgraphs of size 4 by definition, but we find that there
are no statistically significant deviations of the frequencies of
these motifs in 1K-random graphs from the corresponding
counts in $H_{k_{MAX}}$. Finally, the global betweenness or shortest
path length distributions are not guaranteed to be captured
by dK-random graphs with any d < N, but again we see
that these global statistics are well reproduced in 1K-random
graphs, although not as well as the local statistics. At the same
time none of the considered structural properties is closely
approximated by 0K-random graphs (except for the shortest
path length distribution).

Collectively these observations imply that even though the
$H_{k_{MAX}}$ core is so dense (link density D = 0.962) and close
to being a complete graph that one could expect it to be
well approximated by 0K-random graphs, this expectation is
actually wrong, whereas 1K-random graphs closely capture the
basic structural properties of this core. We applied the same
dK-analysis to the $H_{k_{MAX}}$ cores in all the other snapshots,
and found them all to be 1K-random as well. In other words,
1K-randomness of the $H_{k_{MAX}}$ core is another time-invariant
property.

V. Interpretation of Internet's k-Dense Properties

In this section we interpret and explain the statistical results
from the previous section using specifics of the Internet, where
nodes are ASes and links are business relationships between
ASes. That is, a link between a pair of ASes is a business
agreement between the two organizations that enables these
networks to exchange traffic. Recall that business relationships
can be very roughly split into two classes: provider-customer
and peering. Providers announce all the destinations to their
customers, and thus forward all the traffic that their customers
forward to them. Peers mutually announce a limited set of
destinations, typically just their own destinations and their
customer networks. Providers often charge their customers
using the 95th percentile measurement scheme [12], i.e. the
cost of the service depends on the amount of traffic exchanged.
Peering is usually free of charge (unless maintenance costs are
considered). The setup of public peering is greatly simplified
by Internet eXchange Points (IXPs), facilities where each
participant AS can create a single peering connection to any
other participants that agree to peer (open peering policy),
or to a specific subset of those (selective/restrictive peering
policy). In what follows we explain the main observations from
Section IV in view of these Internet-specific realities.

A. Growth of the )zm ax -dense index 

Even though the size of the Hk MAX core fluctuates over 
time, Table ITVl the km ax -dense index steadily grows, Figure 
|2] In this section we provide some evidence that this growth 
is correlated with (if not due to) the proliferation of IXPs. 

Fig. 6. Basic structural properties of the [...] distribution of the corresponding occurrences computed across different dK-random graph instances.

The first piece of evidence is the structural change of the
Internet topology in the last decade due to different growth
rates of peering and provider-customer relationships. The
original economic driver behind peering is to reduce costs 
that customers pay to their providers. However, as the price of 
transit has been steadily decreasing, this driver becomes less 
prominent, unless large volumes of traffic are exchanged. As 
shown in [11], peering grows fast, contributing much more to
middle AS tiers, compared to the tier-1 ASes. Large Content
Providers (CPs) and Content Delivery Networks (CDNs), which
generate a significant percentage of the total Internet traffic
[11], [33], are primary drivers behind this process because:

1) a shorter path between these networks and subscribers
provides better performance;

2) although the traffic exchanged is highly asymmetric, for 
most ISPs the connection to the content providers is 
vital. 

Furthermore, [33] asserts that a dominant percentage of AS-
level traffic flows directly between large CPs, CDNs, data
centers, and consumer networks, and that 150 ASes originate
more than 50% of the Internet inter-AS traffic. Reference [10]
also supports our claim by describing the ground truth behind
a large European IXP: the authors show that the number
of peering links established in this facility is unexpectedly
large: about 50,000 peering links among approximately 350
AS members. Yet another piece of evidence can be found in
[18]: the percentage of ASes that are members of at least one
IXP within a given k-dense set is a rapidly growing function
of k. In addition, all the k_MAX-dense ASes are members of
at least one IXP.
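
To put the figures quoted from [10] in perspective, the following minimal sketch compares the reported number of peering links with the maximum possible number of links among the members. The inputs are the rounded values from the text (about 50,000 links, approximately 350 members), so the result is only indicative; the function names are ours and purely illustrative.

```python
def max_possible_links(n_members: int) -> int:
    """Number of links in a full mesh (clique) of n_members ASes."""
    return n_members * (n_members - 1) // 2


def link_density(n_links: int, n_members: int) -> float:
    """Fraction of all possible member pairs that actually peer."""
    return n_links / max_possible_links(n_members)


# Rounded values quoted in the text from [10]; not exact measurements.
members, links = 350, 50_000

print(max_possible_links(members))             # 61075
print(round(link_density(links, members), 2))  # 0.82
```

Under these rounded inputs, roughly four out of five possible member pairs peer, which is the kind of near-clique structure in which links share many triangles.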

To further support our claim here we perform analysis
similar to [18]. We collect the May 2012 information about the
60 k_MAX-dense ASes in our 2012 snapshot from PeeringDB
[34], a project maintaining a database aimed at facilitating the
exchange of information related to peering: "what networks
are peering, where they are peering, and if they are likely to
peer with you". We find that 58 out of the 60 ASes have a
peering record in the database, and that all these 58 ASes
are members of the Deutscher Commercial Internet Exchange
(DE-CIX) [35], one of the largest Internet Exchange Points
worldwide, located in Frankfurt, Germany, with the current
membership count of more than 450 ASes. We investigate the
profile of the remaining 2 ASes, and find that they both are
also DE-CIX members: one is a telecommunication company
with an unknown peering policy, the other is a hosting service
provider with an open peering policy.

B. The 2- and 3-dense-sets 

Nodes with the k-dense-index equal to 2 or 3 form the vast
majority of all ASes in the Internet. These peripheral ASes
contribute most to the overall network growth: they have 
small degrees, but a large percentage of all connections in 
the Internet are attached to these ASes ED, 0. All ASes 
of degree 1 and all ASes with zero clustering belong to the 
2-dense-set. These customer ASes connect to the Internet, 
but their business is not Internet-driven: they set up their 
connections to obtain Internet connectivity, thus all they need 
is a transit provider. Sometimes, for backup, higher availability,
or other purposes, they set up agreements with multiple
providers, i.e. they purchase transit from more than one provider
(multihoming). If the two providers of a multihomed AS 
happen to be connected, then a triangle is formed, so that 
the multihomed AS belongs to the 3-dense-set. 
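
The membership rules above can be checked on a toy topology. The sketch below is ours and is not the implementation used for the measurements. It assumes the definition common in the k-dense literature, in which every link of the k-dense subgraph is shared by at least k-2 triangles inside that subgraph, and it takes the k-dense-index of a node to be the largest k for which the node still has a link in the k-dense subgraph. All AS names are hypothetical.

```python
def k_dense_subgraph(edges, k):
    """Links of the k-dense subgraph (assumed rule: >= k-2 common neighbors)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:  # removing a link can invalidate others, so iterate
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}


def k_dense_index(edges):
    """Map each node to the largest k whose k-dense subgraph contains it."""
    index = {}
    k = 2
    current = {frozenset(e) for e in edges}  # assumes no self-loops
    while current:
        for link in current:
            for node in link:
                index[node] = k
        k += 1
        current = k_dense_subgraph([tuple(link) for link in current], k)
    return index


# Hypothetical toy topology: a single-homed stub, and a multihomed AS
# whose two providers are connected to each other.
toy_links = [
    ("stub", "provider1"),
    ("multihomed", "provider1"),
    ("multihomed", "provider2"),
    ("provider1", "provider2"),
]
print(k_dense_index(toy_links))
```

In this toy graph the single-homed stub receives index 2, while the multihomed AS and its two interconnected providers form a triangle and receive index 3, matching the description in the text.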

C. The k_MAX-dense-set vs. high-rank ASes

The k_MAX-dense ASes form the most densely connected
community by definition. The easiest way to establish such dense
connectivity in practice is by connecting to a large IXP
and declaring an open peering policy. Given that CPs and
CDNs benefit from peering with any willing-to-peer ASes
[37], it is quite plausible that CPs and CDNs are the main
players behind the formation of this densely interconnected
substructure. Surprisingly, the adoption of an open peering
policy is an emerging phenomenon among Network Service
Providers (NSPs) [38], i.e. tier-2 ASes with their own
backbone network that purchase transit from an upstream
provider and resell it to other ASes. Although these ASes
usually adopt a selective peering policy, as they do not want to
peer with potential customers, such peering connections help
tier-2 ASes to provide a better end-user experience to their
customers [39].

The PeeringDB data shown in Tables V(a) and V(d) confirm
this statement. Indeed, large percentages of the k_MAX-dense
ASes:

• can be considered as Content or Network Service
Providers,

• direct most of their traffic outbound, and

• have an open peering policy.

The percentage of ASes with selective peering policies (almost
all NSPs) is also significant, but all these ASes are good
candidates for selective peering as well, explaining their high
k-density. Indeed, one commonly considered aspect in the peer
selection process is the symmetry of the exchanged traffic [40].
We do not have access to traffic statistics, but we can use
the number of IP addresses in the customer cone [41] as a
proxy. The customer cone of an AS is the set of ASes that can
be reached from the AS following only provider-to-customer
links. In other words, it is the set of destinations that can be
reached for free upon peering with the AS. The distributions
of the customer cone sizes (the numbers of /32 IP addresses in
the customer cone, to be precise) shown in Figure 7 indicate
that the customer cone sizes of k_MAX-dense ASes are large,
but their distribution is narrower than in the rest of the Internet,
thus making k_MAX-dense ASes potential peer candidates even
in the selective peering case.
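
The customer cone definition above translates directly into a graph traversal. The following is a minimal sketch under stated assumptions, not the procedure of [41]: the input is a hypothetical mapping from each provider to its direct customers, the AS itself is counted in its own cone (controlled by the `include_self` flag, our choice), and the address count simply sums per-AS address totals, ignoring any overlap between address blocks. The AS numbers are taken from the range reserved for documentation.

```python
from collections import deque


def customer_cone(provider_to_customers, asn, include_self=True):
    """ASes reachable from `asn` following only provider-to-customer links."""
    cone = set()
    seen = {asn}
    queue = deque([asn])
    while queue:  # breadth-first search over provider-to-customer links
        current = queue.popleft()
        for customer in provider_to_customers.get(current, ()):
            if customer not in seen:
                seen.add(customer)
                cone.add(customer)
                queue.append(customer)
    if include_self:  # assumption: an AS belongs to its own cone
        cone.add(asn)
    return cone


def cone_size_in_addresses(cone, addresses_per_as):
    """Cone size as a number of /32 addresses (overlaps are ignored)."""
    return sum(addresses_per_as.get(asn, 0) for asn in cone)


# Hypothetical data: 64496 sells transit to 64497 and 64498,
# which both sell transit to 64499.
p2c = {64496: [64497, 64498], 64497: [64499], 64498: [64499]}
addresses = {64496: 1024, 64497: 256, 64498: 256, 64499: 256}

cone = customer_cone(p2c, 64496)
print(sorted(cone))
print(cone_size_in_addresses(cone, addresses))
```

An AS with no customers has a cone consisting only of itself, so its cone size is just its own address count.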

We next juxtapose this k_MAX-dense-set against different
sets of high-rank ASes, in particular high-degree ASes, shedding
more light on the different Internet-specific meanings of
the k-dense-index and node degree.

TABLE V

AS PROPERTIES FROM PEERINGDB. THE k_MAX-dense COLUMN REFERS TO THE 60 k_MAX-DENSE ASES IN THE 2012 SNAPSHOT, WHILE THE degree-rank AND AS-rank COLUMNS REFER TO THE 60 HIGHEST-DEGREE AND HIGHEST-AS-RANK ASES [36]. THE DATA FOR THE HIGHEST-ADDRESS- AND PREFIX-RANK ASES ARE SIMILAR TO THE HIGHEST-AS-RANK ASES.

Fig. 7. Distributions of customer cone sizes for the k_MAX-dense ASes (black circles) and all ASes in the Internet (empty squares).

Tier-1 is defined as a set of ASes that do not need to pay
any transit providers to reach any destination. To accomplish
this, all the tier-1 ASes connect to each other forming a clique.
This structure ensures that all these ASes have high, although
not necessarily highest, k-dense-indices. However, because they
provide routes to all the destinations in the Internet without
paying any upstream providers, they have the largest customer
cones [11], [8]. Since the main business role of tier-1 ASes
(or more generally, of any transit provider) is to sell transit,
they have a lot of connections to enterprise customers, Figures
3(d) and 3(e), but an open peering policy would not increase their
revenue. Therefore, the tier-1 ASes are more likely to adopt
restrictive peering policies. These considerations suggest that
these ASes, contrary to a common belief, do not largely belong
to the densest community, i.e. to H_{k_MAX}. In particular, the
ASes with the largest customer cones are not k_MAX-dense,
Figure 7.
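
The link between the tier-1 clique and high k-dense-indices can be made explicit with a short worked argument. It assumes the definition common in the k-dense literature, in which every link of a k-dense subgraph has multiplicity at least k-2 within that subgraph. The symbols n (the number of tier-1 ASes), K_n (their clique), N(u) (the neighbors of u), and m(u,v) (the multiplicity of link (u,v)) are introduced here for illustration only.

```latex
% In the clique K_n any two members u, v are adjacent, and each of the
% remaining n-2 members is a common neighbor of u and v.
m_{K_n}(u,v) \;=\; \bigl|\, N_{K_n}(u) \cap N_{K_n}(v) \,\bigr| \;=\; n-2
\quad\Longrightarrow\quad
m_{K_n}(u,v) \;\ge\; k-2 \quad \text{for all } k \le n .
% Hence K_n is contained in the n-dense subgraph H_n, and every tier-1 AS
% has a k-dense-index of at least n.
```

This is only a lower bound for the tier-1 ASes. It does not prevent other ASes from reaching larger indices, which is consistent with tier-1 indices being high but not necessarily the highest.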

In addition, we compute the overlaps between the set of the
60 ASes in the H_{k_MAX} core of the 2012 snapshot, and the
top-60 ASes ranked by their customer cone size. The results
in Table VI show that these overlaps are not substantial. In the
same table we also show the overlap between the k_MAX-dense
ASes and the top-60 highest-degree ASes. From Figure
4 we know that a high degree does not necessarily mean a high
k-dense-index. In Table VI we find that the overlap between
these two AS sets is about 50%, and if we look back at Table
V, we observe that these highest-degree or highest-rank ASes
have quite different properties, compared to the k_MAX-dense
ASes. Specifically, the former AS sets have

• a higher percentage of global ASes, Table V(b),

• a higher percentage of 10Gbps+ ASes, Table V(c),

• a lower percentage of mostly outbound ASes, Table V(d),

• a higher percentage of ASes having a restrictive peering
policy, Table V(e), and

• a lower percentage of ASes having an open peering
policy, Table V(e).

Simply put, high-degree or high-rank ASes tend to be very
large transit providers, while k_MAX-dense ASes tend to be
either content providers or tier-2s.

TABLE VI

THE 60 k_MAX-DENSE ASES IN THE 2012 SNAPSHOT VS. TOP-60 HIGH-RANK ASES. THE TABLE SHOWS THE OVERLAPS BETWEEN THE SET OF ASES IN H_{k_MAX} AND: address-rank, prefix-rank, AND AS-rank: THE SETS OF 60 ASES WITH THE LARGEST NUMBERS OF IPV4 /32 ADDRESSES, IPV4 ROUTING PREFIXES, AND ASES IN THEIR CUSTOMER CONES; AND degree-rank: THE SET OF 60 HIGHEST-DEGREE ASES. ALL THE DATA ARE FROM [36].

k_MAX-dense | address-rank | prefix-rank | AS-rank | degree-rank
60          | 4            | 6           | 7       | 31
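
The overlap counts reported in Table VI can be expressed as fractions of the 60-AS core, which makes the "about 50%" figure for degree-rank explicit. The snippet below is a small illustrative sketch: the counts are copied from Table VI, while the `overlap` helper and the variable names are ours and only show how such counts could be obtained from two AS lists.

```python
def overlap(set_a, set_b):
    """Number of ASes that appear in both sets."""
    return len(set(set_a) & set(set_b))


# Overlap counts copied from Table VI (each ranking contains 60 ASes).
core_size = 60
overlap_counts = {
    "address-rank": 4,
    "prefix-rank": 6,
    "AS-rank": 7,
    "degree-rank": 31,
}

overlap_fractions = {
    ranking: count / core_size for ranking, count in overlap_counts.items()
}
for ranking, fraction in overlap_fractions.items():
    print(f"{ranking}: {fraction:.1%}")
```

The three customer-cone-based rankings each share roughly a tenth or less of their members with the core, whereas the degree ranking shares about half.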

D. 1K-randomness of the H_{k_MAX} core

If an AS has an open peering policy, it simply sets up a
certain number of connections (equal to its degree), without
choosing its neighbors in any way, e.g. based on their degrees
or any other properties. On the contrary, an AS with a
selective/restrictive peering policy always chooses its peers, and
one of the topological properties emerging from this selection
process is the degree of the peering candidate, which is correlated
with its customer cone size. In the latter case, we thus cannot
expect that the degree distribution alone is sufficient to fully
describe the graph. Some non-trivial degree correlations must
be present in it, and indeed the Internet AS-level graph as
a whole was found to be not 1K- but 2K-random in [19].
The large percentage of open-peering k_MAX-dense ASes, and
the similarity between their customer cone sizes are thus the
main factors explaining the 1K-randomness of the H_{k_MAX}
subgraph.
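
A graph is 1K-random when, apart from its degree sequence, its links are placed at random. One common way to sample such graphs is degree-preserving rewiring, sketched below. This is a generic illustration and not necessarily the generator used to produce the random graph instances in this paper; the function names, the attempt limit, and the toy graph are our own assumptions.

```python
import random
from collections import Counter


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    count = Counter()
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return count


def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Randomize a simple graph by double-edge swaps that keep all degrees."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:  # arbitrary cap
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # swap would create a duplicate link
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


# Hypothetical toy graph: an 8-node ring with two chords.
toy = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4), (1, 5)]
rewired = degree_preserving_rewire(toy, n_swaps=20, seed=1)
print(degrees(rewired) == degrees(toy))
```

Comparing a statistic of the real subgraph with its distribution over many such rewired copies is the kind of test that the notion of 1K-randomness refers to: if the real value is typical of the rewired ensemble, the degree sequence alone accounts for it.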

Let us assume for a moment that a large fraction of k_MAX-dense
links is present at a single large IXP. We do not have
access to the peering matrix data of any large IXPs, but the
data reported in [10] suggest that this assumption may very
well be correct, with DE-CIX being one of the possible large IXP
candidates. If we also assume that the peering policies declared
in PeeringDB reflect the peering policies that each AS follows
at each IXP where it has a presence, then our explanation of
the 1K-randomness of H_{k_MAX} is supported by the data in Figure
7 and Table V(e).

VI. Conclusions 

In summary, as the Internet grows over time, the maximum
k-dense index grows as well. That is, the densest and innermost
Internet core H_{k_MAX} becomes increasingly dense. Yet
this form of densification can only partially be attributed to
the growing average degree. The fluctuations between node
degree and k-dense index are strong, and the two statistics
reflect two different Internet-specific properties of AS nodes.
High-degree and high-k-dense-index ASes tend to be transit
and content providers, respectively. The latter form the densest
community in the Internet, which only loosely overlaps with
the high-rank ASes forming the tier-1 core. The structure of the
H_{k_MAX} core is relatively simple. Statistically, it is almost fully
determined by its degree distribution, a property that can be
explained by open peering policies that many content providers
and NSPs tend to follow at IXPs. Most importantly, after
proper normalization, all the considered k-dense properties of
the Internet appear time-invariant. In particular, a vast majority
of all AS links in the Internet are attached either to ASes with
the k-dense-index equal to 2 or 3, or to the k_MAX-dense ASes.

Speaking more generally, the Internet's k-dense properties,
derived from a recursive variant of edge multiplicity measuring
the frequency and density of triangle overlaps, appear to be
statistically significant and time-invariant structural properties
that cannot be fully captured by either degree distribution (1K-
distribution) or degree correlations (2K-distribution). The 3K-
distribution, the distribution of subgraphs of size 3, may very
well capture the Internet's k-dense properties. Even though
this conjecture is quite plausible, it is not guaranteed to be
correct because reproducing the frequency of degree-labeled
triangles does not automatically imply that the whole k-dense
decomposition hierarchy of nested subgraphs H_k is correctly
reproduced as well. Therefore it is interesting to investigate
what existing or new Internet topology models and generators
are capable of explaining or at least reproducing the k-dense
properties of the Internet.